Reading notes: N-ary Relation Extraction with Multiscale Representation Learning

Notes on reproducing the results of N-ary RE with Multiscale Representation Learning.

Link to the source paper: http://arxiv.org/abs/1904.02347

1. What problems is the paper trying to solve?

1. Improve N-ary RE by improving both recall and precision:

  • To improve recall, widen the span by learning representations at different scales:
    1. throughout the document
    2. across the subrelation hierarchy. For example, a drug-gene-mutation relation may never co-occur within a single span. But if we find the gene-mutation relation in an earlier paragraph and the drug-mutation relation in a later paragraph, we can still infer the N-ary relation, even with a long distance in between.
  • To improve precision:
    1. add weak signals.
    2. use an entity-centric formulation instead of the conventional mention-centric one.
    3. allow discontiguous spans of text containing the entity mentions, instead of contiguous spans only.
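
The subrelation example above can be illustrated with a toy sketch. This is not the paper's model, which learns representations rather than applying rules; it only shows how two binary subrelations found far apart can be joined on a shared mutation to propose a ternary candidate. The entity names, paragraph indices, and the infer_ternary helper are all hypothetical.

```python
from itertools import product

# Hypothetical binary subrelation evidence, each found in a different paragraph.
# Tuples are (paragraph_index, entity, mutation); all values are placeholders.
gene_mutation = [(2, "GENE_X", "MUT_1")]
drug_mutation = [(7, "DRUG_A", "MUT_1")]


def infer_ternary(drug_mut, gene_mut):
    """Join binary subrelations on the shared mutation to propose ternary candidates."""
    candidates = []
    for (p_drug, drug, mut_d), (p_gene, gene, mut_g) in product(drug_mut, gene_mut):
        if mut_d == mut_g:  # same mutation links the two subrelations
            candidates.append({
                "drug": drug,
                "gene": gene,
                "mutation": mut_d,
                # distance between the two pieces of evidence, in paragraphs
                "paragraph_gap": abs(p_drug - p_gene),
            })
    return candidates


ternary = infer_ternary(drug_mutation, gene_mutation)
print(ternary)
```

Here the two subrelations sit five paragraphs apart, yet the join still yields one drug-gene-mutation candidate.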

2. What are the limitations of current practices?

1. Current RE datasets/models only deal with small spans and local relations, not long documents:

  • They miss relations whose entities are far apart, which lowers recall.

2. Most current approaches deal with binary relations instead of N-ary ones:

  • They cannot satisfy the increasing demand for N-ary relations such as drug-gene-mutation interactions.

3. Advantages of using the entity-centric formulation

1. Computationally efficient:

  • One entity can be mentioned many times across a document, so working at the entity level avoids classifying every combination of mentions.

2. Increased precision:

  • Many of the mentions of a single entity are noise.
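
One way to read the efficiency argument is to count candidates. The sketch below assumes a mention-centric approach scores every combination of one mention per entity, while an entity-centric approach scores the entity tuple once. The mention counts are made-up numbers, not figures from the paper.

```python
from math import prod

# Hypothetical mention counts for one document (made-up numbers).
mention_counts = {"drug": 4, "gene": 10, "mutation": 3}

# Mention-centric: every combination of one mention per entity is a candidate.
mention_level_candidates = prod(mention_counts.values())

# Entity-centric: one candidate per entity tuple, however often each is mentioned.
entity_level_candidates = 1

print(mention_level_candidates, entity_level_candidates)
```

The mention-level count grows multiplicatively with document length, which is why the gap widens on long documents.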

9. Modification of the source code

  1. Instead of pip install torch=1.0.0 (pip rejects a single = as a version specifier), I had to use pip install torch==1.0.0 torchvision==0.2.1 -f https://download.pytorch.org/whl/torch_stable.html